A procedure to automatically enrich verbal lexica with subcategorization frames

نویسندگان

  • Irene Castellón
  • Laura Alonso Alemany
  • Nevena Tinkova Tincheva
چکیده

In this paper we introduce a method for automatically assigning subcategorization frames to previously unseen verbs of Spanish, as an aid to syntactical analysis. Since there is not a consensus on the classes of subcategorization frames, we combine supervised and unsupervised learning. We apply clustering techniques to obtain coarse-grained subcategorization classes from an annotated corpus of Spanish, then evaluate these classes and we finally use them to learn a classifier to assign subcategorization frames to the verbs of previously unseen sentences.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ARTÍCULO A procedure to automatically enrich verbal lexica with subcategorization frames

In this paper we introduce a method for automatically assigning subcategorization frames to previously unseen verbs of Spanish, as an aid to syntactical analysis. Since there is not a consensus on the classes of subcategorization frames, we combine supervised and unsupervised learning. We apply clustering techniques to obtain coarse-grained subcategorization classes from an annotated corpus of ...

متن کامل

Inducción de Clases de Comportamiento Verbal a partir del Corpus SENSEM

In this paper we present the construction of a classifier with the final objective of automatically assigning subcategorization frames to previously unseen verb senses of Spanish, starting from a generalization of manually annotated frames. Taking as a departure point the data base SENSEM (Fernández et al 2004), the subcategorization frames of 1161 verbal senses have been acquired. These frames...

متن کامل

Detecting Verbal Participation in Diathesis Alternations

We present a method for automatically identifying verbal participation in diathesis alternations. Automatically acquired subcategorization frames are compared to a hand-crafted classification for selecting candidate verbs. The minimum description length principle is then used to produce a model and cost for storing the head noun instances from a training corpus at the relevant argument slots. A...

متن کامل

Using Loglinear Clustering for Subcategorization Identification

In this paper we will describe a process for mining syntactical verbal subcategorization, i.e. the information about the kind of phrases or clauses a verb goes with. We will use a large text corpus having almost 10,000,000 tagged words as our resource material. Loglinear modeling is used to analyze and automatically identify the subcategorization dependencies. An unsupervised clustering algorit...

متن کامل

Towards the Fully Automatic Merging of Lexical Resources: A Step Forward

This article reports on the results of the research done towards the fully automatically merging of lexical resources. Our main goal is to show the generality of the proposed approach, which have been previously applied to merge Spanish Subcategorization Frames lexica. In this work we extend and apply the same technique to perform the merging of morphosyntactic lexica encoded in LMF. The experi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Inteligencia Artificial, Revista Iberoamericana de Inteligencia Artificial

دوره 12  شماره 

صفحات  -

تاریخ انتشار 2008